ABSTRACT

We present an alternative approach to identifying and characterizing 
jet substructure. An angular correlation function is introduced that can 
be used to extract angular and mass scales within a jet without reference 
to a clustering algorithm. This procedure gives rise to a number of useful 
jet observables. As an application, we construct a top quark tagging 
algorithm that is competitive with existing methods. 
1 Introduction



In preparation for the LHC, the past several years have seen extensive work on 
various aspects of collider searches. With the excellent resolution of the ATLAS and 
CMS detectors as a catalyst, one area that has undergone significant development 
is jet substructure physics. The use of jet substructure techniques, which probe the 
fine-grained details of how energy is distributed in jets, has two broad goals. First, 
measuring more than just the bulk properties of jets allows for additional probes of 
QCD. For example, jet substructure measurements can be compared against
precision perturbative QCD calculations or used to tune Monte Carlo event generators.
Second, jet substructure allows for additional handles in event discrimination. These 
handles could play an important role at the LHC in discriminating between signal and 
background events in a wide variety of particle searches. For example, Monte Carlo 
studies indicate that jet substructure techniques allow for efficient reconstruction of 
boosted heavy objects such as the W± and Z⁰ gauge bosons [1–4], the top quark
[5–10], and the Higgs boson [11–16].

At least two broad classes of jet substructure techniques have been developed. 
The first class employs jet shape observables to probe energy distribution in jets. 
The second class makes use of the clustering tree of a jet as constructed by the 
Cambridge-Aachen (CA) [17] or k_T [18] sequential jet clustering algorithms to identify
and characterize subjets within the jet. 

Jet shape observables offer a measure of how energy is distributed within a jet. 
The energy distribution of a jet is determined by a variety of factors, including heavy 
particle decays, color flow, and the dynamics of the parton shower. Different jet 
shape observables have been constructed to quantify these [19–21] and other aspects
of jet substructure. Infrared and collinear (IRC) safe observables can in principle 
be computed in perturbation theory or modeled with Monte Carlo simulations and 
then compared to experimental results. Combining different jet shape observables has 
been shown to provide for effective discrimination in a variety of different scenarios 
(see e.g. [25]). A disadvantage of jet shape observables is that, because they can only
be computed once the constituents of the jet have been defined, they cannot be used 
to determine how to most effectively select jets within a given event. In particular,
a jet shape observable is only as good as the choice of particles that define the jet.
As a result, jet shape observables do not offer a way of selectively removing likely
contamination from underlying event or pile-up.¹

¹ See, however, [25].

The CA and k_T sequential jet algorithms are defined by metrics that have
been chosen with the goal of constructing clustering trees that closely approximate
the perturbative QCD parton shower. The first few branches of the clustering tree can
be used to decompose a jet into subjets. This unclustering procedure has seen a wide
variety of phenomenological applications, especially in the context of tagging jets that
result from boosted heavy particle decays, e.g. filtering in boosted Higgs searches [11].
A closely related procedure, referred to as pruning [27], vetoes on QCD-like branches 
with the goal of sharpening jet mass resolution. This family of procedures offers a 
number of tunable parameters, allowing the user to control how much and what kind 
of substructure is identified. A disadvantage of these procedures is that, in order for 
them to be most effective, the clustering tree must accurately reconstruct the parton 
shower history of the jet. In practice the CA and k_T algorithms reconstruct the most
probable shower history, which need not coincide with the actual shower history. In 
addition, the parameters which define the unclustering typically impose a hard line 
between QCD-like behavior and non-QCD-like behavior that can fail to accommodate 
jets that deviate too much from "most probable" jets. 

The goal of this paper is to explore an alternative procedure for identifying and 
characterizing substructure within jets. The discussion is organized as follows. In 
Section 2, we introduce the "angular correlation function" G(R) and discuss how
structure in G(R) can be used to construct IRC safe jet observables. In particular,
we use G(R) to extract angular scales R* and mass scales m* directly from the
constituents of a jet without use of a clustering tree. These angular and mass scales
correspond to the angular separations and invariant masses of pairs of hard
substructure in the jet. In Section 3, we present an application of these ideas to the tagging of
boosted top quarks. We find that the resulting top tagging algorithm is competitive
with other methods in the literature. Given the straightforward approach we take in
applying G(R) to top tagging, this good performance 'out of the box' is encouraging.
In Section 4 we discuss other possible applications of the methods introduced in this 
paper. 



2 Angular Correlation Function 

To characterize substructure in a jet J we define the angular correlation function
G(R) as

    G(R) \equiv \frac{\sum_{i<j} p_{Ti}\, p_{Tj}\, \Delta R_{ij}^2\, \Theta(R - \Delta R_{ij})}
                     {\sum_{i<j} p_{Ti}\, p_{Tj}\, \Delta R_{ij}^2}
         \approx \frac{\sum_{i<j} p_i \cdot p_j\, \Theta(R - \Delta R_{ij})}
                      {\sum_{i<j} p_i \cdot p_j}                                  (1)

where the sum runs over all pairs of constituents of J and Θ(x) is the Heaviside step
function. Here p_Ti is the transverse momentum of constituent i, and ΔR_ij is the
Euclidean distance between i and j in the pseudorapidity (η) and azimuthal angle
(φ) plane: ΔR_ij² = (η_i − η_j)² + (φ_i − φ_j)². On the LHS of Eq. (1) the dependence
on transverse momenta is fixed by collinear safety. Provided that ΔR_ij is raised to a
positive power, the entire expression is IRC safe. We choose ΔR² in Eq. (1) so that
G(R) has a clear physical interpretation: G(R) is the (fractional) mass contribution
from constituents separated by an angular distance of R or less. An important point
here is that R does not mark the distance with respect to any fixed center.

Figure 1: The angular correlation function G(R) for a sample top jet.
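As an illustration, the angular correlation function of Eq. (1) can be evaluated directly from a list of jet constituents. The following is a minimal Python sketch, assuming NumPy and constituents supplied as arrays of p_T, η and φ. The function names `pair_weights` and `angular_correlation` are hypothetical, the sum is taken over unordered pairs i < j, and the convention Θ(0) = 1 is an assumption rather than something fixed by the text.

```python
import numpy as np

def pair_weights(pt, eta, phi):
    """Return (w_ij, dR_ij) for all unordered pairs i < j, with w_ij = pT_i pT_j dR_ij^2."""
    pt, eta, phi = (np.asarray(a, dtype=float) for a in (pt, eta, phi))
    i, j = np.triu_indices(len(pt), k=1)
    dphi = np.abs(phi[i] - phi[j])
    dphi = np.where(dphi > np.pi, 2.0 * np.pi - dphi, dphi)  # wrap the azimuthal difference
    dr = np.hypot(eta[i] - eta[j], dphi)
    return pt[i] * pt[j] * dr**2, dr

def angular_correlation(pt, eta, phi, r_values):
    """Fraction of the pairwise 'mass' carried by pairs with dR_ij <= R, for each R."""
    w, dr = pair_weights(pt, eta, phi)
    r_values = np.asarray(r_values, dtype=float)
    inside = dr[None, :] <= r_values[:, None]  # Theta(R - dR_ij), assuming Theta(0) = 1
    return (inside * w).sum(axis=1) / w.sum()

# Three equally hard constituents at pairwise separations 0.3, 0.4 and 0.5.
g = angular_correlation([100, 100, 100], [0.0, 0.3, 0.0], [0.0, 0.0, 0.4],
                        [0.1, 0.35, 0.45, 0.6])
```

Each pair separation produces one step in `g`, which is the "cliff" structure described in the text; the step height is the pair's share of the total pairwise weight.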

For a jet with no substructure, G(R) is featureless. In contrast, if a jet has
significant substructure at an angular scale R = R*, G(R) exhibits a discontinuous
cliff at R = R*; see Fig. 1. Such a cliff corresponds to two or more hard subjets
separated by a distance R* from one another, with the cliff height determined by
the invariant mass of the subjets. Notice that these cliffs are closely related to mass
drops as exploited in a variety of jet substructure studies [8–12]. We expect that a
typical QCD jet will have an angular correlation function that is more or less smoothly
varying without any sharp cliffs, while for a jet with significant substructure G(R)
will have one or more sharp cliffs at angular scales R = R* corresponding to distinct
separations between hard subjets in the jet. This suggests several jet observables
that can be defined from G(R). Given a procedure for finding cliffs in G(R), we can
consider: (i) the total number of cliffs; (ii) the angular scales R = R* at which cliffs
are found; and (iii) the cliff heights at each R = R*. We will see that, once suitably
defined, each of the resulting observables proves useful in characterizing substructure
within jets.

In effect, G(R) defines a continuous family of jet shape observables. Each G(R_0)
for a given R_0 differs from most jet shape observables in that: (i) it does not contain
any preferred or reference four-vectors (e.g. the energy center of the jet); and (ii)
it involves a sum over two-particle correlations. For example, the radial jet energy
profile ψ(R) as in [28, 29] quantifies the fraction of a jet's energy that is contained
within an angular distance R of the center of the jet. Although ψ(R) for a top jet
will exhibit discontinuous cliffs at particular angular scales, these scales are not useful
for characterizing the substructure of the jet. This is because the resulting angular
scales, which are defined with respect to the jet center, cannot be used to reconstruct
the separations between the three top subjets. In addition, the invariant masses of
pairs of subjets are not accessible from ψ(R). The angular correlation function G(R)
is closer in spirit to factorial moments as in [30], which were introduced to quantify
scaling behavior in multi-particle production.

Figure 2: p_T plot and angular structure function ΔG(R) for the top jet whose G(R) is
illustrated in Fig. 1. (a) The p_T plot depicts the transverse energy deposited in calorimeter
cells of size 0.1 x 0.1 in (η, φ), with the area of each red square proportional to the p_T.
This top has p_T ~ 300 GeV and a clean three-pronged substructure. (b) For a minimum
prominence of 4.0, ΔG(R) has three peaks with R_1* = 0.66, R_2* = 0.91, and R_3* = 1.48.
The red arrows illustrate the prominence of the two peaks at R_2* and R_3*.

In order for the observables derived from G(R) to be useful, care must be taken
in defining them. We find that, instead of directly finding cliffs in G(R), it is
preferable to find peaks in a suitably chosen derivative of G(R). In particular, because
we are interested in ratios of mass scales, we should look for structure in log G(R).²
Because QCD is approximately scale invariant, structure in log G(R) should be
identified by calculating derivatives with respect to log R. Since d/d log R = R d/dR,
this choice ensures that noise in log G(R) at small R does not result in extraneous
peaks. This suggests that the quantity of interest is d log G(R)/d log R. A concern
with d log G(R)/d log R is that the derivative produces a delta function δ(R − ΔR_ij);
as a consequence, d log G(R)/d log R defines a noisy function of R. Therefore, to
identify structure in log G(R) we define an "angular structure function" ΔG(R) by
replacing the delta function in d log G(R)/d log R with a smooth kernel K(x):

    \Delta G(R) \equiv R\, \frac{\sum_{i<j} p_{Ti}\, p_{Tj}\, \Delta R_{ij}^2\, K(R - \Delta R_{ij})}
                                {\sum_{i<j} p_{Ti}\, p_{Tj}\, \Delta R_{ij}^2\, \Theta(R - \Delta R_{ij})}      (2)

² The normalization in G(R) has been chosen with this logarithm in mind: G(R) increases
monotonically from 0 to 1 as R increases from R = 0 to R = max ΔR_ij.

Figure 3: An illustration of how prominence requirements, by selecting peaks that stand
out above background noise, prevent angular scales from being double-counted.



In the following we choose a Gaussian kernel, K(x) = exp(−x²/dR²)/√(π dR²), with dR = 0.06. We
find that this choice reduces noise substantially. This value of dR was selected after
scanning a range dR ∈ [0.02, 0.12] and choosing dR to maximize the performance of
the top tagging algorithm presented in Sec. 3.
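The smoothed derivative of Eq. (2) can be sketched in a few lines. The following Python example assumes NumPy, a Gaussian kernel normalized to unit area with width dR = 0.06 as quoted above, and pair weights w_ij = p_Ti p_Tj ΔR_ij² with separations ΔR_ij already computed. The function name `angular_structure` is hypothetical, and returning zero where the denominator of Eq. (2) vanishes (R below the smallest pair separation) is a convention chosen here, not one stated in the text.

```python
import numpy as np

def angular_structure(w, dr, r_values, d_r=0.06):
    """Evaluate R * sum_ij w_ij K(R - dR_ij) / sum_ij w_ij Theta(R - dR_ij) on a grid of R."""
    w, dr, r = (np.asarray(a, dtype=float) for a in (w, dr, r_values))
    x = r[:, None] - dr[None, :]                       # R - dR_ij for every (R, pair)
    kernel = np.exp(-(x / d_r) ** 2) / np.sqrt(np.pi * d_r**2)
    numerator = (w * kernel).sum(axis=1)
    denominator = (w * (x >= 0)).sum(axis=1)           # Theta(0) = 1 assumed
    out = np.zeros_like(r)
    # Leave the result at zero wherever no pair lies within R (assumed convention).
    np.divide(r * numerator, denominator, out=out, where=denominator > 0)
    return out

# A single pair at separation 0.5 gives a peak of height 0.5 * K(0) at R = 0.5.
values = angular_structure([1.0], [0.5], [0.4, 0.5])
```

Because the kernel has finite width, nearby pair separations merge into one peak, which is the intended noise reduction; the price is some smearing of the peak height.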

To identify angular scales R = R* in the jet that correspond to distinct hard
substructure in the event, it is important to find peaks in ΔG(R) in a way that is
robust against noise.³ For this purpose we borrow a concept from geography called
(topographic) prominence [31]. The prominence of the highest peak is defined as
its height. In the mountaineering analogy, the prominence of any lower peak P
is defined as the minimum vertical descent that is required in descending from P
before ascending a higher, neighboring peak P', where P' can lie to either side of P.
Fig. 2(b) illustrates this concept for two different peaks. In Fig. 3 we illustrate how
using prominence instead of height to identify physical peaks can eliminate extraneous
peaks that are artifacts of the detector's finite angular resolution. The pictured jet
has two distinct hard subjets separated by a single angular scale ΔR. Since one of
the subjets has its energy deposited in two neighboring calorimeter cells, the angular
structure function ΔG(R) exhibits two distinct peaks in the neighborhood of R = ΔR.
Only one of the two peaks has a large prominence, and so using prominence to select
peaks in ΔG(R) ensures that only a single angular scale near R = ΔR is identified.
In the following we will identify a peak in ΔG(R) by demanding that its prominence
exceeds a minimum value h_0.

³ Using the kernel K(x) reduces the noise in ΔG(R) but does not eliminate it completely.
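The prominence criterion described above can be sketched for a function sampled on a grid. This Python example assumes NumPy and implements the definition as worded in the text: the highest peak has prominence equal to its height, and any lower peak has prominence equal to the smallest descent needed before climbing to a higher point on either side. The function name `prominent_peaks` and the sample values are hypothetical, and treating the left edge of a plateau as the peak is a simplification.

```python
import numpy as np

def prominent_peaks(y, h0):
    """Return [(index, prominence)] for local maxima of y whose prominence exceeds h0."""
    y = np.asarray(y, dtype=float)
    peaks = []
    for k in range(1, len(y) - 1):
        if not (y[k] > y[k - 1] and y[k] >= y[k + 1]):
            continue                                   # not a local maximum
        descents = []
        for side in (y[k + 1:], y[:k][::-1]):          # walk right, then walk left
            higher = np.nonzero(side > y[k])[0]
            if higher.size:                            # a higher peak exists on this side
                descents.append(y[k] - side[: higher[0]].min())
        # No higher point on either side: this is the highest peak, prominence = height.
        prominence = min(descents) if descents else y[k]
        if prominence > h0:
            peaks.append((k, float(prominence)))
    return peaks

# Two nearby maxima (indices 1 and 3) mimic one angular scale split across two cells.
sample = [0.0, 5.0, 4.5, 6.0, 1.0, 3.0, 0.0]
selected = prominent_peaks(sample, h0=1.0)
```

With the threshold at 1.0 only one of the two neighboring maxima survives, which is the double-counting protection illustrated in Fig. 3. Library routines such as SciPy's peak finder use a slightly different convention for the highest peak, so results may differ there.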

So far we have described how to define two different jet observables from prominent
peaks in ΔG(R). The first is n_p, the number of prominent peaks in ΔG(R). The
second is the set of angular scales R_i* at which prominent peaks are located. It
remains to define a jet observable that corresponds to cliff heights in G(R). The
magnitude of a cliff's height in G(R) will map onto the height of the corresponding
peak in ΔG(R). This height is determined by the invariant mass of (typically) two
hard subjets separated by an angular distance R = R_i*. For each prominent peak in
ΔG(R) with height ΔG(R_i*) we define the partial mass m_i* by

    m_{i*}^2 \equiv \frac{\sqrt{\pi}\, dR}{R_{i*}}\; \mu_J^2\; G(R_{i*})\; \Delta G(R_{i*})      (3)

where we have used Eq. (2) to extract the (appropriately normalized) numerator of the
angular structure function. Here

    \mu_J^2 \equiv \sum_{i<j} p_{Ti}\, p_{Tj}\, \Delta R_{ij}^2      (4)

is the denominator of G(R) in Eq. (1) and is approximately equal to the squared jet
mass m_J². To see the physics that is encoded in the partial mass, consider a jet with
two infinitely narrow, hard subjets separated by an angular distance ΔR and with
transverse momenta p_T1 and p_T2. This jet will exhibit a single prominent peak in
ΔG(R) at R = ΔR. The corresponding partial mass m* will be given by m*² =
p_T1 p_T2 ΔR² ≈ 2 p_1·p_2.⁴ Thus the partial mass is a measure of the mass at a particular
angular scale. For a jet whose substructure is determined by a heavy particle decay,
the partial masses will be fixed by the kinematic constraints of the decay. This
observation will be explored further in Sec. 3 in the context of top tagging.
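The two-subjet example can be checked explicitly. The short derivation below is a sketch under stated assumptions: the partial mass is taken to be the numerator of the angular structure function divided by R and by the kernel peak value K(0) = 1/(√π dR), the Gaussian kernel is the unit-area one with width dR, and Θ(0) = 1. The symbol μ_J² denotes the sum of p_Ti p_Tj ΔR_ij² over pairs.

```latex
\begin{align}
  \mu_J^2\, G(R) &= p_{T1}\, p_{T2}\, \Delta R^2\, \Theta(R - \Delta R), \\
  \Delta G(\Delta R) &= \Delta R\,
      \frac{p_{T1}\, p_{T2}\, \Delta R^2\, K(0)}{p_{T1}\, p_{T2}\, \Delta R^2}
      = \frac{\Delta R}{\sqrt{\pi}\, dR}, \\
  m_*^2 &= \frac{\sqrt{\pi}\, dR}{\Delta R}\; \mu_J^2\, G(\Delta R)\, \Delta G(\Delta R)
      = p_{T1}\, p_{T2}\, \Delta R^2, \\
  2\, p_1 \cdot p_2 &= 2\, p_{T1}\, p_{T2} \left( \cosh \Delta\eta - \cos \Delta\phi \right)
      \approx p_{T1}\, p_{T2} \left( \Delta\eta^2 + \Delta\phi^2 \right)
      = p_{T1}\, p_{T2}\, \Delta R^2 .
\end{align}
```

The last line holds for massless subjets at small angular separation, which is why the partial mass approximates the invariant mass of the pair.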

Now that we have defined n_p, R_i*, and m_i*, we can ask how these jet observables
characterize the substructure of a jet. First, for an idealized jet composed of n_s hard,
narrow subjets with each pair of subjets separated by distinct angular scales, we
expect the number of peaks n_p to be given by

    n_p = n_p^{\max} \equiv \frac{n_s (n_s - 1)}{2}      (5)

In general this equality becomes an inequality, n_p ≤ n_p^max, for jets whose substructure
is less clean. For example, if some of the n_s subjets are wide or if some of the
angular separations are approximately degenerate, then ΔG(R) may exhibit fewer
than n_p^max prominent peaks. When a prominent peak is resolvable, however, the
resulting angular scale corresponds to an angular separation between two or more
hard substructures in the jet. For a QCD jet, the distribution of prominent peaks
should be roughly uniform in R, since QCD is approximately scale invariant. For a
jet that is initiated by a heavy particle decay, the angular scales R_i* will be peaked at
values characteristic of the decay kinematics of the heavy particle. The corresponding
partial masses will be correlated to mass scales intrinsic to the heavy particle decay. In
contrast, for QCD jets the partial masses will be peaked at small values, as determined
by the soft and collinear singularities of QCD.

⁴ Note that for two subjets j_1 and j_2 that are not infinitely narrow, the Gaussian kernel in Eq. (2)
introduces some amount of smearing in the partial mass.

Figure 4: (a) p_T plot and (b) angular structure function ΔG(R) for a QCD jet with diffuse
substructure and p_T ~ 600 GeV. In the p_T plot, the small cell at the end of the arrow is
so soft that it is barely visible. Prominent peaks in ΔG(R) are distributed approximately
uniformly in R. For a minimum prominence of 4.0, ΔG(R) has a single peak at R_1* = 1.09.
Note the scale of ΔG(R) as compared to the top jet in Fig. 2(b).

Some of the foregoing discussion is illustrated in Figs. 2 and 4. In Fig. 2 we show
a boosted top jet with a clean three-pronged substructure. In the p_T plot in Fig. 2(a)
the distances R_i* between the three hardest cells are indicated. From Fig. 2(b) we
see that it is these same three angular scales that show up as prominent peaks in
the angular structure function ΔG(R). Less prominent peaks correspond to
soft-hard correlations in the jet. The substructure of the QCD jet in Fig. 4(a) is quite
different, with a single hard core surrounded by soft diffuse radiation. The mass of
the jet is largely due to these soft, wide-angle emissions, and the most prominent peak
in ΔG(R) corresponds to correlations between the hard core of the jet and one such
emission. Prominent peaks in ΔG(R) for this QCD jet are distributed approximately
uniformly in R, as expected.

Figure 5: (a) p_T plot and (b) angular structure function ΔG(R) for a top jet with p_T ~ 500
GeV. The decay products of the W are not individually resolved, with most of the radiation
from the W (φ ~ 2.8) contained within a single, hard cell. For a minimum prominence of
4.0, ΔG(R) has a single peak at R_1* = 0.39.

The close correspondence between structure in the p_T plots apparent by eye and
the structure identified by the angular structure function ΔG(R) is encouraging.
Investigating the effectiveness of this procedure more thoroughly requires testing
it against a concrete application, where the characteristics of the observables n_p,
R_i*, and m_i* can be explored in greater detail. A good testbed will involve jets with
complex substructure. For this reason we choose to construct a top tagging algorithm
as a first application.



3 Top tagging 



If every top jet had the clean three-pronged structure apparent in Fig. 2(a), then
constructing an efficient top tagger would be straightforward. In practice,
reconstruction of the top is complicated by a number of factors, including: (i) the finite
resolution of the detector, which degrades mass and angular resolution; (ii) collinear
radiation, which can make it difficult to resolve subjets initiated by hard partons
that are close together; and (iii) the boost from the top rest frame to the lab frame,
which can result in decay products that are soft or overlap with one another. As
a consequence, many top jets will have fewer than three prominent peaks in their
angular structure functions. For example, in Fig. 5 we show a top jet
in which the W± decay products do not exhibit a clean two-pronged structure. As
a result ΔG(R) only has a single prominent peak corresponding to mass correlations
between the W and the b subjet. Constructing a tagger with high signal efficiencies
will therefore require considering top jets with fewer than three prominent peaks in
their angular structure functions.

This suggests that the following procedure could result in an efficient top tagging
algorithm. Fix a minimum prominence h_0. For each candidate jet, calculate the
angular structure function and identify the number of peaks n_p with prominences
exceeding h_0. Reject candidate jets with n_p = 0 or n_p > 3 and sort the rest into
bins with n_p = 1, 2, 3. Then apply separate sets of cuts to the R_i* and m_i* in
each bin. This procedure has the advantage that candidate jets are being sorted
with respect to their observed topologies. For example, top jets in which the decay
products of the W± are merged will be treated differently from top jets that exhibit a
clean three-pronged substructure. In each bin cuts will be applied to the observables
available from the identified substructure, and the cuts can be separately optimized
to reflect the diversity of actual tops. By not requiring candidate jets to have the
substructure of an idealized top jet with three distinct prongs, the top tagger can
be more accommodating towards "ugly duckling" tops and thus attain higher signal
efficiencies.⁵

The outline of this section is as follows. In Sec. 3.1 we discuss distributions of the
observables R_i* and m_i* for top jets and QCD jets. In Sec. 3.2 we present the details
of our top tagging algorithm. In Sec. 3.3 we describe the Monte Carlo used to test
the top tagger as well as the performance of the algorithm.



3.1 Observables 

To set the stage for the top tagging algorithm defined in the next section, we
first discuss what sort of top jet discrimination is available from the observables R_i*
and m_i*. In Fig. 6 we illustrate distributions for these observables in the n_p = 3
bin. For top jets the kinematic constraints of the top decay in conjunction with the
boost to the lab frame account for the basic features (see appendix A for details).
Identifying the smallest R_i*, i.e. R_1*, with the angle between the b subjet and the
closer of the W± subjets, we expect that R_1* ~ 0.25 for this 500 GeV < p_T < 600
GeV bin. Similarly, identifying R_2* with the angle between the two W subjets and
R_3* with the angle between the b subjet and the further of the W subjets, we expect
that R_2* ~ 0.50 and R_3* ~ 0.75. With these identifications for the three peaks, the
predictions for the partial masses become m_1* ~ 50 GeV, m_2* ~ m_W, and m_3* ~ 140
GeV. These predictions for the R_i* and m_i* match up well with the distributions in
Fig. 6, although in practice the corresponding identifications only hold on average.
Note that the kinematic constraints of the top quark decay imply strong correlations
between R_i* and m_i* for each i. This is illustrated in Fig. 7, where R_2* has been
plotted against m_2* in the n_p = 3 bin. For QCD jets R_2* and m_2* are uncorrelated.

⁵ A similar flexibility is found in the multi-body filtering employed by the HEPTopTagger [9, 10].

Figure 6: Distributions for observables in the n_p = 3 bin with 500 GeV < p_T < 600
GeV. Distributions for top jets (QCD jets) are shown in blue (red). Angular scales and
partial masses are ordered so that R_1* < R_2* < R_3*. For QCD the R_i* distributions
are consistent with scale-invariant emission, while the m_i* distributions peak towards small
partial masses. For tops the R_i* and m_i* distributions are peaked at angular and mass
scales characteristic of top decay kinematics. Distributions are normalized to unity.

In contrast to top jets, QCD jets have no intrinsic scales. Since QCD is
approximately scale invariant and the derivative in ΔG(R) is with respect to log R,
we expect the R_i* distributions to be approximately uniform. Imposing the ordering
R_1* < R_2* < R_3* then has the consequence that the R_1* distribution should peak at
R = 0, the R_2* distribution should peak at intermediate R, and the R_3* distribution
should peak towards large R. This is consistent with what is seen in Fig. 6, up to
edge effects at large R in the R_3* distribution. The partial masses of QCD jets are
peaked towards small m_i*, as we expect given that the physics of m_i* is qualitatively
similar to the physics of jet masses m_J.

Figure 7: Correlations between R_2* and m_2* in the n_p = 3 bin with 500 GeV < p_T <
600 GeV. For top jets the kinematic constraints imply strong correlations between R_2* and
m_2*, while for QCD jets the two are uncorrelated. Correlations for top jets (QCD jets) are
depicted in blue (red).

The features of the distributions in the n_p = 1, 2 bins are qualitatively similar; see
Fig. 9. Here it is less clear what identifications to make for the different peaks, and it
is likely that there is a fair amount of mixing between different decay topologies. In
any case the observables derived from ΔG(R) in the n_p = 1, 2 bins make effective
discriminants between top jets and QCD jets, although more discrimination is available
in the n_p = 3 bin. The distributions for R_1* and m_1* in the n_p = 1 bin are consistent
with correlations between the W± subjets j_W1 and j_W2; one possibility is that for
these top jets the b subjet is too soft to yield prominent peaks. The distributions for
the n_p = 2 bin are consistent with correlations between the b subjet and each of the
two W± subjets; one possibility is that for these top jets the W± subjets j_W1 and
j_W2 are nearly merged so that correlations between j_W1 and j_W2 do not result in any
prominent peaks.

Figure 8: The jet mass m_J for tops (blue) and QCD (red) in the n_p = 3 bin with 500 GeV
< p_T < 600 GeV.

Figure 10: Fractions of top jets (blue) and QCD jets (red) that have n_p prominent peaks.
Here the minimum prominence is h_0 = 4.0 and 500 GeV < p_T < 600 GeV. These fractions
exhibit only a small dependence on p_T.



3.2 An algorithm 



The distributions in Figs. 6–9 suggest that imposing cuts on m_J, R_i*, and m_i*
could lead to effective discrimination between top jets and QCD jets. To test this
we employ the following top tagging algorithm. Using the CA algorithm, cluster
the event into fat jets with R = 1.5. Although a more advanced version of the
tagger could benefit from using variable R (or a filtered jet mass), we leave the
value of R fixed for simplicity. Before applying any cuts, first presort the candidate
jets into p_T bins of width 100 GeV. Then for each candidate jet calculate ΔG(R)
and identify the number of peaks n_p whose prominence exceeds a fixed minimum
prominence h_0 = 4.0. This value of h_0 has been selected by scanning over a range
h_0 ∈ [1.0, 10.0] and choosing h_0 to minimize the background efficiency over a wide
range of p_T and signal efficiencies. Within each p_T bin further sort the candidate jets
into three peak bins (n_p = 1, 2, 3), throwing out jets with n_p = 0 or n_p > 3. This
n_p cut removes a sizable fraction (~ 15%) of QCD jets, while rejecting only ~ 3%
of top jets; see Fig. 10. For discrimination between top jets and QCD jets to be
most effective one would like to disentangle the correlations between the observables
as much as possible; for simplicity, however, we choose to make rectangular cuts in the
space of observables. In particular, in the n_p = 3 bin we choose to impose cuts on six
of the seven available observables, excluding m_1*, which is the least discriminating
observable. More specifically, we impose the following cuts:



1. m_J > m_t^min

2. R_1* < R_1*^max, R_2* < R_2*^max, R_3* < R_3*^max

3. m_2* > m_2*^min, m_3* > m_3*^min

A candidate jet that passes this set of cuts is tagged as a top jet. In the n_p = 1, 2
bins we employ the corresponding set of cuts, except that, in contrast to the n_p = 3 bin,
we make use of all of the observables. Also, we impose an additional cut m_J < m_t^max
in the n_p = 1 bin only, since the smaller number of observables in the n_p = 1 bin
(three) means that imposing this cut does not substantially increase the computer
time needed to find optimal cuts. For the moment we leave the values of the cuts
unspecified; this will be addressed in the next section.
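The rectangular cuts described above amount to a small decision function per n_p bin. The following Python sketch is illustrative only: the function name `is_top_tagged` and the dictionary layout are hypothetical, and the numerical thresholds are the sample values read off Table 1 for the 500 GeV < p_T < 600 GeV bin at a total signal efficiency of 50%, with masses in GeV. The peak lists are assumed to be ordered by increasing angular scale.

```python
# Sample cut values (GeV for masses) as read from Table 1, 500-600 GeV bin; illustrative only.
CUTS = {
    1: {"mj_min": 175.0, "mj_max": 300.0, "r_max": [0.57], "m_min": [74.0]},
    2: {"mj_min": 159.0, "r_max": [0.57, 1.00], "m_min": [36.0, 55.0]},
    # In the three-peak bin the first partial mass is not cut on.
    3: {"mj_min": 155.0, "r_max": [0.62, 0.66, 1.35], "m_min": [None, 46.0, 73.0]},
}

def is_top_tagged(m_jet, r_star, m_star, cuts=CUTS):
    """Apply the rectangular cuts for the bin set by the number of prominent peaks."""
    n_p = len(r_star)
    if n_p not in cuts:                       # reject n_p = 0 or n_p > 3
        return False
    c = cuts[n_p]
    if m_jet <= c["mj_min"] or m_jet >= c.get("mj_max", float("inf")):
        return False
    if any(r >= r_max for r, r_max in zip(r_star, c["r_max"])):
        return False
    return all(m_min is None or m > m_min for m, m_min in zip(m_star, c["m_min"]))

# A jet with the idealized three-peak top kinematics quoted in Sec. 3.1.
tagged = is_top_tagged(172.0, [0.25, 0.50, 0.75], [50.0, 80.0, 140.0])
```

Keeping one table of thresholds per bin mirrors the separate optimization of each n_p bin; a real analysis would hold one such table per p_T window and per target signal efficiency.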



3.3 Results 

We use two different event samples for evaluating the performance of the top
tagger. These event samples (from pp collisions with center of mass energy of 7 TeV)
belong to a set of benchmark event samples that have been made publicly available by
participants of the BOOST 2010 workshop [32]. The first event sample is generated
by HERWIG 6.510 [33] with the underlying event simulated by JIMMY [34], which has
been configured with a tune used by ATLAS. The second is generated by PYTHIA 6.4
[35] with Q²-ordering and the DW tune for the underlying event. See [36] for more
details. Unless noted otherwise, all results presented in this paper make use of the
HERWIG event samples; the PYTHIA event samples were used as crosschecks. For signal
jets we use the hardest jet in each event of a Standard Model hadronic tt̄ sample,
excluding jets with |η| > 2.5. For background jets we use the hardest jet in each event
of a Standard Model dijet sample, again excluding jets with |η| > 2.5. For both event
samples there are O(10⁴) events in each p_T bin of width 100 GeV. For jet clustering we
use the CA algorithm [17] with R = 1.5 as implemented by FastJet 2.4.2 [37]. In
order to simulate the finite resolution of the ATLAS or CMS calorimeters, particles in
each event are clustered into 0.1 x 0.1 cells in (η, φ) and then combined into massless
four-vector pseudoparticles that are fed into FastJet. For each p_T window the cuts
are chosen to yield the smallest background efficiency at each fixed signal efficiency
ε_S. This optimization is performed by a custom Monte Carlo code that finely samples
the space of cuts. Some sample values for the different cuts are given in Table 1.
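The calorimeter granularity step can be sketched as follows. The text specifies only that particles are grouped into 0.1 x 0.1 cells in (η, φ) and combined into massless pseudoparticles; everything else in this Python example is an assumption made for illustration: the function name `to_massless_cells`, summing particle energies within a cell, and pointing the pseudoparticle at the cell center. Since 2π is not a multiple of 0.1, the last φ cell is slightly narrower in this sketch.

```python
import numpy as np

def to_massless_cells(energy, eta, phi, size=0.1):
    """Bin particles into (eta, phi) cells; return one massless (E, px, py, pz) row per cell."""
    energy, eta, phi = (np.asarray(a, dtype=float) for a in (energy, eta, phi))
    ieta = np.floor(eta / size).astype(int)
    iphi = np.floor(np.mod(phi, 2.0 * np.pi) / size).astype(int)
    cell_energy = {}
    for key, e in zip(zip(ieta.tolist(), iphi.tolist()), energy):
        cell_energy[key] = cell_energy.get(key, 0.0) + e   # assumed: energies add within a cell
    rows = []
    for (ie, ip), e in cell_energy.items():
        eta_c, phi_c = (ie + 0.5) * size, (ip + 0.5) * size  # assumed: direction of cell center
        p_t = e / np.cosh(eta_c)                              # massless: |p| = E
        rows.append([e, p_t * np.cos(phi_c), p_t * np.sin(phi_c), p_t * np.sinh(eta_c)])
    return np.array(rows)

# Two particles share a cell; the third sits in a different cell.
cells = to_massless_cells([10.0, 20.0, 5.0], [0.01, 0.09, 1.23], [0.02, 0.08, 3.04])
```

The resulting rows can be handed to a jet clustering library as input four-vectors; forcing each cell to be massless discards any mass generated inside a single cell, which is one way finite resolution degrades the reconstructed jet mass.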



In Fig. 11(a) and Fig. 11(b) we illustrate the performance of the top tagger. The 
performance is comparable to other top taggers in the literature [BUS] I2"T ] I3"8"H4"2] . with 
€b ~ 5% for es = 50% and €3 ~ 0.5% for es = 20% [36]. For a fixed signal efficiency, 
the background efficiency is approximately flat across the pt range we have tested, 
200 GeV < pt < 800 GeV. In Table [T] we see that in the n p = 2 and especially n p = 3 
bins, where correspondingly more observables are available for discrimination, the top 
tagger is able to attain large signal efficiencies. Because the net signal and background 
efficiencies are obtained by combining all three n p bins, the largest contribution to 
is actually from the n p = 2 bin, since the plurality of top jets land in the n p = 2 bin 



for h = 4.0 (see Fig. 10). For example, at e s = 50% and for 500 GeV < p T < 600 
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Figure 11: The performance of the top tagger as given by the HERWIG event samples. 
The background efficiency vs. signal efficiency for our top tagger is compared to other 
algorithms in the literature in (a). This figure is reproduced from [36J with the results 
from our tagger added. Here the candidate jets have transverse momenta 500 GeV < pr < 
600 GeV. For Fig. (a) only, candidate jets have been clustered with the anti-kT algorithm 
with R = 1.0, as was done in the BOOST study. As a consequence the performance in 
(a) is better than in (b), where the large jet radius degrades top mass resolution. In (b) 
the background efficiency is plotted as a function of px for signal efficiencies of es = 50% 
(black), 40% (blue), 30% (green) and 20% (red). Efficiencies at a given pxo are calculated 
from a pr window of 100 GeV centered at pto- Note that, as a consequence, each point is 
not statistically independent. Error bands are statistical. 



n_p = 1          m_t^min    m_t^max    R_{1*}^max    m_{1*}^min    ε_S (%)    ε_B (%)
300 - 400 GeV    177 GeV    300 GeV    0.96          78 GeV        23.8       1.9
500 - 600 GeV    175 GeV    300 GeV    0.57          74 GeV        27.0       2.6

n_p = 2          m_t^min    R_{1*}^max    R_{2*}^max    m_{?*}^min    m_{?*}^min    ε_S (%)    ε_B (%)
300 - 400 GeV    157 GeV    0.85          1.59          30 GeV        77 GeV        57.2       11.4
500 - 600 GeV    159 GeV    0.57          1.00          36 GeV        55 GeV        59.6       9.8

n_p = 3          m_t^min    R_{1*}^max    R_{2*}^max    R_{3*}^max    m_{?*}^min    m_{?*}^min    ε_S (%)    ε_B (%)
300 - 400 GeV    102 GeV    0.81          1.03          2.11          26 GeV        79 GeV        82.9       15.9
500 - 600 GeV    155 GeV    0.62          0.66          1.35          46 GeV        73 GeV        73.6       7.9

Table 1: Sample optimized cut parameters at a (total) signal efficiency of ε_S = 50% for two
different p_T bins. In the rightmost columns we show the signal and background efficiencies
obtained within each n_p bin taken separately; i.e. these numbers do not take into account
what fraction of candidate jets end up in each n_p bin. Signal efficiency increases substantially
with n_p.

Figure 12: Signal versus background efficiency curves for HERWIG (blue) and PYTHIA (red)
event samples in the 500 GeV < p_T < 600 GeV bin. Error bands are statistical.



GeV about 55% of tagged top jets come from the n_p = 2 bin, while about 20% and
25% come from the n_p = 1 and n_p = 3 bins, respectively. Similarly, the background
efficiency is lowest in the n_p = 1 bin; only QCD jets with two or three prominent
peaks do a good job of faking the substructure of a top jet. For example, at ε_S = 50%
and for 500 GeV < p_T < 600 GeV about 32%, 54%, and 14% of tagged QCD jets
come from the n_p = 1, n_p = 2, and n_p = 3 bins, respectively, even though only about
31% of QCD jets fall in the n_p = 2 or 3 bins.
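
The bookkeeping that connects per-bin efficiencies to the shares of tagged jets can be
made explicit. The following is a minimal sketch, not part of the analysis itself: the
helper name bin_fractions is hypothetical, and the inputs are the rounded percentages
quoted in Table 1 (500 GeV < p_T < 600 GeV) and in the text, so the result is only as
accurate as that rounding. It assumes that the net efficiency is the sum over n_p bins
of (fraction of jets in the bin) times (efficiency within the bin).

```python
# Illustrative bookkeeping only. Inputs are rounded percentages from the text
# and Table 1 (500 GeV < p_T < 600 GeV); the helper name is hypothetical.

def bin_fractions(per_bin_eff, tagged_share):
    """Infer the fraction f_i of candidate jets in each n_p bin and the net efficiency.

    Assumes eps = sum_i f_i * eps_i, and that the share of tagged jets coming
    from bin i is s_i = f_i * eps_i / eps. With sum_i f_i = 1 this gives
    eps = 1 / sum_i (s_i / eps_i) and f_i = eps * s_i / eps_i.
    """
    weights = [s / e for s, e in zip(tagged_share, per_bin_eff)]
    net_eff = 1.0 / sum(weights)
    return [net_eff * w for w in weights], net_eff

# Top jets: per-bin signal efficiencies (Table 1) and tagged shares (text),
# ordered as n_p = 1, 2, 3.
f_top, eps_s = bin_fractions([0.270, 0.596, 0.736], [0.20, 0.55, 0.25])

print("fraction of top jets per n_p bin:", [round(f, 2) for f in f_top])
print("implied net signal efficiency:", round(eps_s, 3))
```

With these inputs the implied net signal efficiency comes out close to the 50% working
point, and the n_p = 2 bin holds the largest inferred fraction of top jets. The same
function can be applied to the background columns of Table 1.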



As a crosscheck, in Fig. 12 we compare the performance of the top tagging algorithm
between the HERWIG and PYTHIA event samples. We see that the background
efficiency is generally lower for HERWIG than it is for PYTHIA. One possible reason
for this is that, although the cut parameters have been separately optimized for
both event generators, the parameters h_0 = 4.0 and dR = 0.06 were optimized on the
basis of the HERWIG event samples. The HERWIG and PYTHIA event samples already
disagree at the level of the n_p distributions, and this disagreement persists in the
absence of the underlying event. This means that the typical prominence of peaks in
ΔQ(R) differs between the two event samples. It would be interesting to understand
in detail which features of the two event generators (the parton shower description,
the underlying event model, etc.) contribute to this disagreement. Going further in
this direction, however, lies outside the scope of this paper.

Given the large number of cut parameters that enter into the top tagging algorithm,
overtraining is a concern. By training the cut parameters on a subset A of
the event samples and testing the resulting cuts on subsets B_j disjoint from A, we
can get some idea of how susceptible the quoted efficiencies are to overtraining. We
find that the variation in the background efficiency ε_B (at fixed ε_S) that results from
this validation procedure is comparable to the quoted statistical uncertainties. This
additional uncertainty should be kept in mind when considering the absolute performance
of the top tagger. Since precise estimates for background efficiencies are
made difficult by other uncertainties, such as those which enter the modeling of QCD
backgrounds or detector mock-up, we do not consider overtraining any further.
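
The structure of such a validation can be illustrated with a toy study. The sketch
below is an assumption-laden stand-in, not the procedure used for the quoted numbers:
it uses a single toy observable (a "jet mass" drawn from invented distributions), a
single cut trained on a subset A to give ε_S = 50%, and five disjoint test subsets.
All function names, sample sizes and distribution parameters are hypothetical, and no
attempt is made to reproduce the multi-parameter optimization of the tagger.

```python
import random
import statistics

random.seed(0)  # fixed seed so the toy study is reproducible

def toy_sample(n, is_signal):
    """Toy 'jet mass' values in GeV: a peak for signal, a falling spectrum for background."""
    if is_signal:
        return [random.gauss(173.0, 20.0) for _ in range(n)]
    return [random.expovariate(1.0 / 60.0) for _ in range(n)]

def train_cut(signal, target_eff):
    """Choose the lower mass cut that keeps a fraction target_eff of the training signal."""
    ordered = sorted(signal, reverse=True)
    return ordered[int(target_eff * len(ordered)) - 1]

def efficiency(sample, cut):
    """Fraction of the sample that passes the lower mass cut."""
    return sum(m >= cut for m in sample) / len(sample)

# Subset A is used for training; the disjoint subsets B_j are used only for testing.
n_events = 5000
cut = train_cut(toy_sample(n_events, True), target_eff=0.5)
eps_b = [efficiency(toy_sample(n_events, False), cut) for _ in range(5)]

# Compare the spread across test subsets with a binomial statistical uncertainty.
mean_b = statistics.mean(eps_b)
spread = statistics.stdev(eps_b)
stat = (mean_b * (1.0 - mean_b) / n_events) ** 0.5
print(f"cut = {cut:.1f} GeV, eps_B = {mean_b:.4f}, spread = {spread:.4f}, stat = {stat:.4f}")
```

A spread across the test subsets that is much larger than the statistical estimate
would signal that the trained cuts have adapted to fluctuations in subset A.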

Our simple mock calorimeter does not account for a variety of detector effects. 
Recent studies at the LHC (see e.g. [13]) suggest that Monte Carlo tools provide a 
fair description of the performance of jet substructure algorithms. Since the algo- 
rithm discussed in this paper relies on kinematic observables, we suspect that the 
performance of the algorithm will not be exceedingly sensitive to detector effects. 
As a consequence the tagging efficiencies quoted above should be fair estimates of 
what can be expected in the absence of pile-up. Sensitivity to pile-up requires fur- 
ther study, and a full detector simulation would be required to better understand the 
expected performance of the top tagging algorithm. Aspects of the algorithm may 
also be amenable to a sideband analysis. In particular, by looking at regions of the
R_{i*}-m_{i*} plane (see Fig. 7) away from the signal region, the shape of the background
distributions can be extrapolated into the signal region.



4 Discussion 

By sorting jets according to the number of prominent peaks identified in their
angular structure functions ΔQ(R) and making rectangular cuts on the angular and
mass scales R_{i*} and m_{i*}, we have been able to construct an efficient top tagging
algorithm. Since the focus of this paper has been to demonstrate that ΔQ(R) can be used
to identify angular and mass scales in jets, the particular algorithm we have described
was chosen for its simplicity. A number of possible improvements to the algorithm
suggest themselves, however, even leaving aside modifications that are unrelated to
the use of ΔQ(R). One possible concern is the large number of cut parameters that
result from using three peak bins. Given the strong correlations between the R_{i*} and
m_{i*} (see Fig. 7), one way to reduce the total number of free parameters would be to
consolidate some of the variables. For example, one could replace separate cuts on R_{i*}
and m_{i*} with a single cut on m_{i*}/R_{i*}. One could also investigate different schemes for
binning identified peaks in ΔQ(R). For example, the expected substructure of a top
might be better captured by sorting into bins {n_{p0}, n_{p1}}, where bin {n_{p0}, n_{p1}} contains
n_{p0} peaks with prominence P > h_0 and n_{p1} peaks with prominence h_1 < P < h_0.
The definition of the partial mass in Eq. 3, which is most accurate for narrow subjets,
could be improved to better capture the invariant mass of wide subjets. The
particular way in which we organize the observables R_{i*} and m_{i*} according to their
ordering in R, as well as the use of topographic prominence to identify peaks, could
also be revisited. Since ΔQ(R) defines a continuum of observables, this list of
possible modifications could go on indefinitely, and it is interesting to ask whether our
simple procedure makes efficient use of the information available from Q(R). Going
further in this direction, however, lies outside the scope of this paper.
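
As an illustration of the two-threshold binning suggested above, the following minimal
sketch counts peaks in each prominence range. It is not an implementation from this
paper: the function name, the example prominences and the value h_1 = 2.0 are all
hypothetical, and only h_0 = 4.0 is taken from the text.

```python
def prominence_bin(prominences, h0, h1):
    """Return (n_p0, n_p1): peaks with P > h0, and peaks with h1 < P < h0."""
    if not h1 < h0:
        raise ValueError("expected h1 < h0")
    n_p0 = sum(1 for p in prominences if p > h0)
    n_p1 = sum(1 for p in prominences if h1 < p < h0)
    return n_p0, n_p1

# Hypothetical peak prominences for a single jet; h1 = 2.0 is an arbitrary choice.
print(prominence_bin([6.3, 4.4, 2.7, 0.9], h0=4.0, h1=2.0))  # (2, 1)
```

With the strict inequalities used here, a peak with P exactly equal to h_0 falls into
neither class; a practical scheme would have to fix that convention.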

Although we have explored the use of the angular correlation function Q(R) and
the angular structure function ΔQ(R) for the particular application of top tagging,
the generality of the resulting procedure suggests that it could be useful in a variety
of different contexts. It seems likely that procedures that make use of ΔQ(R) will
be most effective when accurate reconstruction of angular scales is valuable. Some
interesting possibilities include:

• using observables defined from ΔQ(R) to probe QCD; for example, measurements
of R_* or n_p distributions for QCD jets could be compared against Monte
Carlo calculations

• using R_* distributions to search for new physics (angular bumps instead of mass
bumps); this is attractive, since accurate mass reconstruction is difficult

• calculating ΔQ(R) for the event as a whole and using the identified angular
scales to determine an appropriate jet radius parameter R event-by-event

• using ΔQ(R) to access helicity/spin information in jetty cascades

• generalizing Q(R) to some kind of n-particle correlation function, which might
prove to be useful in the context of n-body decays

• using ΔQ(R) to zoom in on the prominent angular scales within a jet and
defining some kind of 'angular filtering' procedure to improve mass resolution

• using Q(R) to study correlations in the underlying event

By performing what is essentially an 'angular Fourier transform' on the constituents
of a jet, ΔQ(R) provides a convenient way of accessing angular and mass scales within
jets. These angular and mass scales can be used to characterize the substructure of a
jet. Further work will be needed to determine the extent to which the ideas explored
in this paper can be applied more generally.

A Top Quark Decay Kinematics



If we make some simplifying assumptions about the kinematics of top quark de- 
cays, then we can derive compact formulas for the angular scales where we expect 
top jets to have significant substructure. To do so we first work in the approximation 
that both the top and the W decay isotropically in their rest frames. Then working 
in the limit of large transverse momenta, we can approximate the typical momentum 
fractions of the decay products of the top in the lab frame as 



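\[
z_{W1} = z_{W2} = \tfrac{1}{2} z_W = \frac{m_t^2 + m_W^2}{4 m_t^2} \approx 0.30,
\qquad
z_b = \frac{m_t^2 - m_W^2}{2 m_t^2} \approx 0.40.
\tag{6}
\]

A typical configuration has the decay products approximately distributed along
a line with

\[
R_{b1} < R_{12} < R_{b2}.
\tag{7}
\]

Assuming that the decay topology is exactly line-like with

\[
R_{b2} = R_{12} + R_{b1},
\]

we can use mass constraints to determine the R_{i*} and m_{i*}:

\[
\begin{array}{c|ccc}
 & i = 1 & i = 2 & i = 3 \\ \hline
R_{i*}
 & \dfrac{2 m_t^2}{p_T} \, \dfrac{m_t - m_W}{m_t^2 + m_W^2}
 & \dfrac{2 m_t^2}{p_T} \, \dfrac{2 m_W}{m_t^2 + m_W^2}
 & \dfrac{2 m_t^2}{p_T} \, \dfrac{m_t + m_W}{m_t^2 + m_W^2} \\[2ex]
m_{i*}^2
 & \dfrac{(m_t - m_W)^2}{2} \, \dfrac{m_t^2 - m_W^2}{m_t^2 + m_W^2}
 & m_W^2
 & \dfrac{(m_t + m_W)^2}{2} \, \dfrac{m_t^2 - m_W^2}{m_t^2 + m_W^2}
\end{array}
\]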



where p_T is the transverse momentum of the top quark. Numerical values of these
expressions for p_T = 550 GeV are given in Sec. 3.2.
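
For readers who want to evaluate these expressions at other transverse momenta, the
following minimal sketch computes R_{i*} and m_{i*} for the exactly line-like topology.
It rests on stated assumptions: the default masses m_t = 173 GeV and m_W = 80.4 GeV
are our own inputs rather than values taken from the text, the function name is
hypothetical, and m_{2*} = m_W is taken from the W mass constraint.

```python
import math

def top_decay_scales(p_t, m_t=173.0, m_w=80.4):
    """Evaluate R_{i*} and m_{i*} (GeV) for the line-like top decay topology.

    The default m_t and m_w are assumed inputs, not values quoted in the text.
    """
    s = m_t**2 + m_w**2
    pref = 2.0 * m_t**2 / (p_t * s)
    # Angular scales ordered as R_{b1} < R_{12} < R_{b2}.
    radii = [pref * (m_t - m_w), pref * 2.0 * m_w, pref * (m_t + m_w)]
    ratio = (m_t**2 - m_w**2) / s
    # Partial masses; the middle entry is fixed by the W mass constraint.
    masses = [
        math.sqrt(0.5 * (m_t - m_w) ** 2 * ratio),
        m_w,
        math.sqrt(0.5 * (m_t + m_w) ** 2 * ratio),
    ]
    return radii, masses

radii, masses = top_decay_scales(p_t=550.0)
for i, (r, m) in enumerate(zip(radii, masses), start=1):
    print(f"i = {i}: R = {r:.2f}, m = {m:.1f} GeV")
```

Two properties hold by construction and make useful checks: the largest angular scale
equals the sum of the other two, and the squared partial masses add up to m_t^2. With
the assumed masses, the angular scales at p_T = 550 GeV come out at roughly 0.28, 0.48
and 0.76.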